Detailed knowledge of the interaction between oxygen and hemoglobin is essential both
for  understanding  oxygen  transport  phenomena  and  for  testing  theories  of  protein  structure  and 
function.  The  equilibrium  reaction  is  measured  routinely  simply  to  estimate  the  position  and 
shape  of  the  oxygen  binding  curve  from,  respectively,  the  oxygen  pressure  at  half-saturation  (P50) 
and the index of cooperativity (i.e., the Hill number, n), but thermodynamic analysis of individual
binding  steps  is  more  difficult  because  of  inherent  complexities  in  the  reaction.  For  example,  the 
linearity  of  optical  absorption  with  hemoglobin  fractional  saturation  must  be  assumed  in  the 
absence  of  analytical  tools  to  prove  it.  To  test  this  assumption  and  to  extract  more  information 
from  a  single  experiment,  we  are  developing  techniques  using  rapid-scanning  spectrophotometry 
to  measure  complete  spectra  of  hemoglobin  during  an  oxygen  binding  reaction  and  singular-value 
decomposition  to  resolve  individual  components  in  the  transition.  From  these  analyses,  models 
of  the  equilibrium  reaction  are  being  derived  using  laws  of  mass  action  and  matrix  least  squares. 


BACKGROUND 

The  reversible  chemical  reaction  between  oxygen  and  hemoglobin  has  been  examined  for 
over  a  century.  Even  though  experimental  methods  have  evolved  during  this  time  from  simple 
gasometry  to  more  sophisticated  techniques  in  spectrophotometry,  the  reaction  is  still  difficult  to 
measure  precisely  or  to  interpret  reliably. 

The  first  mathematical  formulation  of  the  equilibrium  was  derived  by  Adair  as  a  mass- 
action  equation,  with  equilibrium  constants  describing  each  of  the  four  binding  steps  of  oxygen 
with the hemoglobin tetramer,1

$$Y = \frac{0.25a_1x + 0.5a_2x^2 + 0.75a_3x^3 + a_4x^4}{1 + a_1x + a_2x^2 + a_3x^3 + a_4x^4} \tag{1}$$

where Y is fractional saturation, $a_1$ through $a_4$ are the overall Adair constants (i.e., the products
of step-wise equilibrium constants, $K_1$ to $K_4$), and x is the partial pressure of oxygen (pO2) in
solution. 
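
As a minimal numerical sketch of Eq. (1), the following Python code evaluates the Adair curve and reads off P50 and the Hill number n from it. The stepwise constants in `K_STEPWISE` are hypothetical values chosen only to give a sigmoid curve, and the interpolation-based estimates of P50 and n are one convenient choice, not a procedure taken from this chapter.

```python
import numpy as np

def adair_saturation(x, a):
    """Fractional saturation Y from Eq. (1); a = (a1, a2, a3, a4), x = pO2."""
    x = np.asarray(x, dtype=float)
    a1, a2, a3, a4 = a
    numerator = 0.25 * a1 * x + 0.5 * a2 * x**2 + 0.75 * a3 * x**3 + a4 * x**4
    denominator = 1.0 + a1 * x + a2 * x**2 + a3 * x**3 + a4 * x**4
    return numerator / denominator

def overall_constants(k):
    """Overall Adair constants as cumulative products of stepwise K1..K4."""
    return np.cumprod(np.asarray(k, dtype=float))

# Hypothetical stepwise constants (mm Hg^-1), for illustration only.
K_STEPWISE = (0.02, 0.05, 0.2, 2.0)
a = overall_constants(K_STEPWISE)

log_p = np.linspace(-1.0, 2.0, 201)      # log10(pO2): 0.1 to 100 mm Hg
p_o2 = 10.0 ** log_p
Y = adair_saturation(p_o2, a)

# P50: interpolate log10(pO2) at Y = 0.5 (Y increases monotonically with pO2).
log_p50 = np.interp(0.5, Y, log_p)
p50 = 10.0 ** log_p50

# Hill number n: slope of log10(Y / (1 - Y)) versus log10(pO2) at P50.
hill_slopes = np.gradient(np.log10(Y / (1.0 - Y)), log_p)
n_hill = np.interp(log_p50, log_p, hill_slopes)
```

With real data, the same two summary numbers would be read from the measured curve; the individual Adair constants are what the later matrix analysis attempts to resolve.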

The  hemoglobin  binding  reaction  was  described  later  by  allosteric  theory  as  an  equilibrium 
between  two  end-state  quaternary  conformations,  one  corresponding  to  the  structure  of 
deoxyhemoglobin with low oxygen affinity (i.e., the T state) and the other corresponding to the
structure  of  oxyhemoglobin  with  high  oxygen  affinity  (i.e.,  the  R  state).2  This  model  has  been 
useful  particularly  in  the  interpretation  of  the  structural-functional  mechanism  of  hemoglobin 
cooperativity,3  but  it  is  limited  in  that  it  predicts  strictly  a  two-state  system.  Recently,  partially 
liganded  species  of  hemoglobin  have  been  identified  with  reaction  energetics  intermediate  to  the 
two end states but with quaternary structures still in either T or R conformation.4

Partially  liganded  intermediates  are  central  to  the  mechanism  of  cooperative  oxygen 
binding, but they are particularly difficult to measure because their low levels during the reaction5
result in a highly sigmoid binding curve. Precise measurements have been accomplished,6,7 but
the mechanism of molecular cooperativity remains controversial. The difficulty originates from
the oxygenation reaction itself, e.g., the low level of partially liganded intermediates combined
perhaps with differences in the reactivity of the α and β subunits8,9 and/or spectral changes
associated with the transition between low- and high-affinity forms.10-15 In addition, at least two
other reactions can complicate interpretation of binding data: (1) dissociation of the α2β2 tetramer
into non-cooperative αβ dimers6,16 and (2) the redox reaction of the heme iron atoms, in which
an oxygen-binding ferrous (Fe2+) heme is oxidized to a non-oxygen-binding ferric (Fe3+) heme.
Experimentally,  these  latter  effects  can  be  limited  by  using  a  hemoglobin  concentration  as  high 
as  possible  to  minimize  the  effects  of  dimers  and  by  performing  the  experiment  as  quickly  as 
possible  or  in  the  presence  of  a  methemoglobin-reducing  system  to  minimize  the  effects  of 
oxidation.17,18

Survey  of  techniques 

Gasometry  provides  an  accurate  method  for  obtaining  single  points  along  the  oxygen 
binding curve. This technique does not require a physical measurement of the oxyhemoglobin
complex; rather, the volume of bound oxygen is measured directly. The disadvantages of using
this  technique  are  that  experience  and  skill  determine  the  precision  of  the  experiment,  a  separate 
and  precise  measurement  must  be  made  of  hemoglobin  concentration,  and  the  experiment  is  long 
and  tedious.  That  is,  the  number  of  experimental  points  along  a  single  curve  depends  on  the 
number of gasometric readings that can be made on a single sample, with each reading taking
~15 minutes. Thus, for example, an experimental curve with only 10 data points takes at least
4 hours to complete: time to equilibrate 10 tonometers, each with the hemoglobin solution at a
different pO2, and a minimum of 150 minutes for the gasometric readings. During this interval,
the  hemoglobin  solution  will  be  autooxidizing,  and  protein  degradation  may  occur. 

Optical  spectroscopy  is  now  usually  the  experimental  method  of  choice  because  of  its 
speed  and  simplicity,  and  because  the  reaction  produces  a  large  spectral  change  based  on 
hemoglobin  concentration.  Spectrophotometry  has  been  combined  with  tonometry  for  expediency 
(i.e., a hemoglobin spectrum can be recorded faster than a gasometric reading can be made).
However, as with the gasometric method, time is needed for equilibration, using either separate
tonometers for each data point or a single tonometer or reaction cell that is equilibrated step-wise
at increasing pO2. In either case, the number of data points that can be measured along each
curve  is  limited,  and  the  experiment  is  still  time-consuming.  The  primary  advantage  of  using 
tonometry  is  that  equilibrium  is  verified  at  each  step  (i.e.,  at  each  data  point)  along  the  curve. 

In  contrast  to  tonometry,  the  automatic  method  of  Imai  was  developed  to  monitor  the 
absorbance of a hemoglobin solution as pO2 changed continuously.7,19 This not only produced
an  equilibrium  curve  rapidly  (e.g.,  typically  20-60  minutes  for  native  human  hemoglobin, 
depending  on  solution  conditions),  thus  limiting  the  effects  of  methemoglobin  formation,  but  also 
provided  a  large  number  of  data  points  along  each  curve.  Fractional  saturation  of  hemoglobin 
is calculated directly from the spectral change of the hemoglobin solution, and pO2 is measured
polarographically  using  an  oxygen  electrode  mounted  in  the  reaction  chamber.  However,  with 
the  continuous  method,  unlike  tonometry,  equilibrium  must  be  assumed  at  each  measurement 
point  along  the  curve,  or  the  reverse  curve  must  be  measured  to  demonstrate  exact 
reproducibility,19  which  is  impossible  if  any  oxidation  has  occurred.  Also,  because  the  reaction 
happens  faster  than  non-rapid-scanning  spectrophotometers  can  scan  a  spectral  region,  the 
transition  is  usually  followed  at  a  single  wavelength.  As  with  any  two-state  model,  analysis  at 
a  single  wavelength  requires  the  further  assumption  that  the  spectral  change  is  linearly  correlated 
with  hemoglobin  fractional  saturation.  For  the  linear  relation  to  hold,  a  single  optical  transition 
with  a  constant  extinction  coefficient  for  each  binding  step  must  take  place  that,  in  this  case, 
would  represent  the  transition  from  deoxy-  to  oxyhemoglobin.  The  actual  experiment  will  be 
more complex if other reactions, such as dimer or methemoglobin formation, also occur.

There is experimental evidence both in support of and in conflict with the linear optical
assumption,20-23 with recent measurements showing that true isosbestic behavior is not maintained
during the hemoglobin oxygenation reaction.23,24 Non-linear optical effects have been reported
from both diode-array spectrophotometry, using the thin-layer tonometry method,25 and a two-
wavelength analysis of stopped-flow reactions by rapid mixing of oxyhemoglobin and
deoxymyoglobin.26

In  another  chapter  in  this  volume,  a  technique  is  described  that  measures  oxygen 
equilibrium  binding  of  concentrated  hemoglobin  solutions  non-optically,  thus  avoiding  the  issue 
of  non-linear  spectral  events.27  In  this  chapter,  we  describe  a  method  based  on  rapid-scanning 
spectrophotometry  to  resolve  directly  unique  optical  transitions  during  continuous  hemoglobin 
oxygenation.  The  rapid-scanning  method  has  revealed  small  but  significant  spectral  changes 
during  oxygenation  in  addition  to  the  large  spectral  difference  from  the  primary  transition  of 
deoxy- to oxyhemoglobin.28

The  overall  utility  of  rapid-scanning  techniques  will  depend  on  two  factors: 

• Scanning rates must be fast enough to capture a complete spectrum of hemoglobin at
a given pO2 during the collection period. However, even at the fastest scanning rates,
time elapses during the measurement, and because of this, we refer to these as
"pseudoequilibrium" measurements to acknowledge the possibility of kinetic events.

• The optical spectrum must have high precision to resolve small spectral changes from
the large oxy-deoxyhemoglobin spectral difference.

Spectrophotometers  are  becoming  faster,  although  whether  they  are  becoming  more  precise 
is debatable. Research has pushed this technology to the limit, and new questions are demanding
the best signals that these instruments can provide. In parallel with the increasing speed of
spectrophotometers,  sophisticated  analytical  tools  for  data  reduction  are  evolving,  and  high-speed 
personal  computers  with  large  memory  capacities  are  available  for  the  first  time  that  can  handle 
the  huge  data  arrays  and  computational  requirements  of  these  experiments.  Here  we  outline 
methods  for  the  collection  and  analysis  of  hemoglobin  spectral  matrices  during  a 
"pseudoequilibrium"  oxygenation  reaction.  Mathematical  techniques  for  matrix  multicomponent 
analysis  and  singular-value  decomposition  (SVD)  are  described,  and  spectral  artifacts  that  are 
resolved  by  these  sensitive  techniques  are  discussed. 


INSTRUMENTATION 

We  have  used  an  LT  Quantum  1200  rapid-scanning  spectrophotometer  (LT  Industries, 
Inc.,  Rockville,  MD)  to  obtain  complete  visible  spectra  of  hemoglobin  solutions  during 
continuous  oxygenation  reactions.  This  instrument  uses  a  tungsten-halogen  light  source  and  a 
rapidly oscillating grating that scans over the 400-800-nm range in 200 ms. During this 200 ms,
a 70-ms dark period is allowed for instrument calibration. The light is then exposed to the
monochromator for 130 ms, during which time the entire range of 400-800 nm is scanned in 80
ms. From this range, we isolate the portion of the spectrum from 480-650 nm, which is collected
in ~34 ms, for data analysis. To improve the signal-to-noise ratio, at least four scans are
collected per data point to produce a single, averaged spectrum with a time resolution of 800 ms.
The  averaged  scans  are  collected  in  alternating  wavelength  directions,  with  an  equal  number  of 
scans  in  the  high-to-low  (800-  to  400-nm)  and  low-to-high  (400-  to  800-nm)  range.  The 
instrument is calibrated such that the scans taken in either direction are indistinguishable at
equilibrium. However, instrumental noise in both the vertical, or amplitude, spectral signal (i.e.,
due  to  noise  in  detector  gain)  and  in  the  horizontal,  or  wavelength,  position  (i.e.,  due  to  slight 
inaccuracies  in  the  reproducibility  of  wavelength  position  from  scan  to  scan)  produces  spectral 
artifacts  that  must  be  accounted  for  in  the  final  analysis. 

A  schematic  diagram  of  the  experimental  apparatus  is  shown  in  Fig.  1.  A  temperature- 
controlled  sample  holder  was  designed  to  hold  a  custom-made  reaction  cuvette  having  a  1-cm 
light  path  and  a  fused  side  port  for  a  polarographic  oxygen  probe  (Yellow  Springs  Instruments, 
Model 5331, Yellow Springs, OH). Ultra-pure gas (i.e., >99.999% purity nitrogen or oxygen in
a 1:1 O2/N2 mixture), humidified by bubbling in water at the same temperature as that circulated
around  the  reaction  cell,  is  introduced  into  the  gas  space  at  the  top  of  the  reaction  cell  through 
a  needle  inserted  in  a  gas-tight  septum.  Venting  takes  place  through  a  second  needle  in  the 
septum.  The  solution  is  mixed  continuously  by  a  small,  magnetic  bar  spinning  just  beneath  the 
tip  of  the  oxygen  probe.  (Stirring  provides  adequate  mixing  but  can  cause  mechanical  stress  on 
the  protein,  and  evidence  of  protein  denaturation  must  be  monitored  during  the  reaction.) 

The  photodetector  signal  of  the  light  passing  through  the  cuvette  is  digitized  for  output 
by  a  16-bit  analog-to-digital  (A/D)  board  in  a  computerized  data  collection  system.  The  voltage 
signal  from  the  oxygen  electrode  is  amplified  (Yellow  Springs  Instruments,  Model  5300 
Biological Oxygen Monitor), filtered digitally, and relayed to a second 16-bit A/D board in a
second computerized data collection system. (Alternatively, a single computer with true
multitasking capabilities could be used to collect the two signals simultaneously.) The digital filter
of  the  oxygen  voltage  signal  increases  the  signal-to-noise  ratio  without  changing  the  time  constant 
of the oxygen probe. The time constant of the oxygen probe is calculated by collecting the
transient signal from a rapid change in solution pO2 with and without the filter in line and fitting
the relaxation to an exponential equation. For the binding experiments, the relaxation time of
the electrode must be faster than the change in pO2 of the solution so that readings will reflect
the true equilibrium value of pO2.
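
The exponential fit of the probe relaxation can be sketched as follows. This is only one simple way to do it, assuming a single-exponential step response with a known final plateau; the function name `fit_time_constant`, the 5-s time constant, and the synthetic noise level are all illustrative assumptions rather than properties of the actual electrode.

```python
import numpy as np

def fit_time_constant(t, v, v_final):
    """Estimate tau for v(t) = v_final + (v0 - v_final) * exp(-t / tau).

    A straight line is fitted to log|v - v_final| versus t, so v_final must
    be known (e.g., from the plateau); near-plateau points are excluded.
    """
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    deviation = np.abs(v - v_final)
    keep = deviation > 0.02 * deviation.max()   # drop points buried in noise
    slope, _ = np.polyfit(t[keep], np.log(deviation[keep]), 1)
    return -1.0 / slope

# Synthetic step response: hypothetical 5-s probe sampled at 120 Hz for 30 s.
rng = np.random.default_rng(0)
t = np.arange(0.0, 30.0, 1.0 / 120.0)
TAU_TRUE = 5.0
v = 1.0 - np.exp(-t / TAU_TRUE) + rng.normal(0.0, 1e-4, t.size)

tau_est = fit_time_constant(t, v, v_final=1.0)
```

Running the same fit on transients recorded with and without the digital filter in line would show whether the filter lengthens the apparent time constant.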

Data  collection  is  under  software  control.  At  the  initiation  and  end  of  a  single  averaged 
spectral  data  set,  transistor-transistor  logic  (TTL)  pulses  are  sent  from  the  computer  controller  of 
the spectrophotometer to the A/D board designated for oxygen voltage readings. During the time
between TTL pulses, the oxygen signal is collected at a specific sampling frequency. For
example, at a frequency of 120 Hz, with four spectral scans averaged at 200 ms/scan, 96 oxygen
voltage readings are accumulated and averaged to give a single pO2 reading corresponding in time
to the average spectrum of hemoglobin collected over the 0.8-s period.

The  voltage  output  from  the  oxygen  electrode  is  converted  to  mm  Hg  from  calibration  of 
the  output  to  the  relative  percent  of  oxygen  in  air-saturated  buffer  at  the  barometric  pressure  in 
the  laboratory  with  a  correction  for  the  water  vapor  pressure  at  the  temperature  of  the  experiment. 
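
A hedged sketch of this conversion is given below. It assumes a linear electrode response, a zero-oxygen voltage offset that can be subtracted, and an oxygen fraction of 0.2095 in dry air; the function name and the example pressures are illustrative and are not taken from the calibration actually used.

```python
def voltage_to_po2(voltage, air_voltage, barometric_mmhg, vapor_mmhg,
                   zero_voltage=0.0, o2_fraction_air=0.2095):
    """Convert an electrode voltage to pO2 (mm Hg), assuming a linear probe.

    air_voltage is the reading in air-saturated buffer; vapor_mmhg is the
    water vapor pressure at the temperature of the experiment.
    """
    po2_air = o2_fraction_air * (barometric_mmhg - vapor_mmhg)
    relative = (voltage - zero_voltage) / (air_voltage - zero_voltage)
    return relative * po2_air

# Example: 760 mm Hg barometric pressure, 25 degrees C (vapor ~23.8 mm Hg).
po2_air = voltage_to_po2(1.0, air_voltage=1.0,
                         barometric_mmhg=760.0, vapor_mmhg=23.8)
po2_half = voltage_to_po2(0.5, air_voltage=1.0,
                          barometric_mmhg=760.0, vapor_mmhg=23.8)
```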


METHOD  OF  EXPERIMENTATION 

The  experiment  is  begun  by  bubbling  pure  N2  gas  through  the  reaction  buffer  in  the 
sample  cell.  Deoxygenation  of  the  buffer  is  monitored  by  the  oxygen  electrode.  At  the  same 
time,  a  concentrated  solution  of  hemoglobin  (i.e.,  >2  mM  in  heme)  is  deoxygenated  separately 
in  a  temperature-controlled,  spinning  tonometer  under  a  N2  atmosphere  (Instrumentation 
Laboratory,  Model  237,  Lexington,  MA).  Once  deoxygenation  is  complete,  the  N2  gas  flow  is 
directed above the surface of the fluid in the reaction cell, and deoxyhemoglobin is transferred
from the tonometer to the reaction cell by a gas-tight syringe that has been flushed previously
with  pure  N2  gas.  The  hemoglobin  sample  is  diluted  in  the  deoxygenated  buffer  in  the  sample 
cell,  and  the  oxygen  voltage  is  monitored  to  assure  that  no  oxygen  was  introduced  during  the 
transfer.  An  initial  spectrum  is  taken  to  verify  the  concentration  of  deoxyhemoglobin  in  the 
reaction cell; typically, concentrations of 60-200 µM in heme are used to provide an adequate
optical signal and to minimize the effect of tetramer dissociation on the measurement. Lower
concentrations of hemoglobin might be used to test the consequence of tetramer dissociation on
the spectral transition or to monitor the reaction in the Soret region of the spectrum. We are
unable to use higher (i.e., ~1 mM) heme concentrations because the upper limit of absorbance
detection is set by the fixed path length of the reaction cell.

Oxygenation is begun by switching gas valves from N2 to 50% O2. The flow rate is
controlled  by  a  gas-flow  controller  (Matheson,  Model  4360,  Newark,  CA)  to  provide  a  gentle 
stream  of  humidified  gas  over  the  surface  of  the  hemoglobin  solution.  (Direct  bubbling  of  gas 
into  the  solution  or  vigorous  gas  flow  at  the  surface  is  avoided  because  of  the  potential  for 
protein  denaturation,  which  can  cause  a  drift  in  the  optical  signal.)  The  reverse  reaction  can  be 
carried  out  on  the  same  sample  by  switching  gas  flow  back  to  pure  N2.  Depending  on  the  flow 
rate,  temperature,  solution  conditions,  hemoglobin  sample,  and  hemoglobin  concentration,  a 
complete oxygenation reaction with this system requires ~20-30 minutes, and deoxygenation,
~40-60 minutes.


ANALYTICAL PROCEDURES


The  spectral  data  are  combined  in  a  single  absorbance  matrix  (A)  with  a  row  for  each 
wavelength and a column for each pO2. The matrix is reduced to a range from 480-650 nm at
1-nm intervals for a total of 171 wavelengths, or rows (m). The final size of A depends on the
number of pO2 readings, or columns (n) (typically 200-400), so that A is an m x n matrix. Two
procedures  are  used  for  analysis  of  the  A  matrix:  multicomponent  analysis  and  SVD. 

Multicomponent  Analysis 

Multicomponent  analysis  determines  the  composition  of  a  mixture  of  components  when 
the spectrum of each of the pure parent species is known. The general procedure involves first
taking a spectrum of each of the parent species. Second, the spectrum of the mixture is taken.
In our case, multiple spectra are taken during the course of a single experiment (i.e., the number
of spectra for multicomponent analysis in each experiment is equal to n). Third, a least-squares
linear curve-fitting procedure is used to minimize the norm (sum of squares) of the residuals and
to obtain the best fit combination of spectra that comprise the mixture. The procedure uses the
Moore-Penrose pseudoinverse of a table of the parent spectra (M), where each spectrum is in a
separate  column.  The  inversion  returns  a  p  x  m  matrix  (C),  where  p  is  the  number  of  parent 
species.  Only  C  and  the  experimental  spectra  (A)  are  needed  for  the  curve  fitting.  The 
procedure  is  as  follows: 

• Compute $C = (M^{T}M)^{-1}M^{T}$, the Moore-Penrose pseudoinverse of M.

• Compute P = CA, where a column of P contains the amounts of the various parent
compounds at the corresponding pO2. These are normalized to compute percentages.

Parent  spectra  have  been  obtained  on  the  rapid-scanning  spectrophotometer  for 
oxyhemoglobin (oxyHb), deoxyhemoglobin (deoxyHb), and methemoglobin (metHb). Using
these parent spectra to create M, multicomponent analysis is performed on each spectrum (i.e.,
each column) in A to obtain an estimate of the percentages of oxyHb, deoxyHb, and metHb in
each spectrum. Fractional saturation (Y) is calculated from this analysis as,

$$Y = \frac{\%\text{oxyHb}}{\%\text{oxyHb} + \%\text{deoxyHb}} \tag{2}$$

in which the percent contribution of metHb is excluded. The change in percent metHb at each
step is used to evaluate the rate of change of metHb formation (ΔmetHb/Δt) as a function of Y.
It should be emphasized that these evaluations are only estimates, because to be accurate, the
multicomponent analysis procedure must include a parent spectrum for every optical species in
the mixed spectrum. To determine the number of optical species in matrix A, SVD is employed.
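
The multicomponent procedure can be sketched in a few lines of NumPy. The parent "spectra" here are hypothetical Gaussian bands, not measured hemoglobin spectra, and the amounts in `true_amounts` are invented so that the recovered values can be checked; only the two matrix steps (C from M, then P = CA) and the saturation formula follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)
wavelengths = np.arange(480, 651)              # 171 wavelengths, 1-nm steps

def band(center, width):
    """Hypothetical Gaussian absorption band (NOT a real hemoglobin spectrum)."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# M: one parent spectrum per column (stand-ins for oxyHb, deoxyHb, metHb).
M = np.column_stack([
    band(541, 12) + band(577, 10),
    band(555, 20),
    0.3 * band(500, 25) + 0.2 * band(630, 15),
])                                             # m x p = 171 x 3

# Synthetic experiment: invented amounts at n = 5 pO2 steps, plus noise.
true_amounts = np.array([
    [0.00, 0.25, 0.50, 0.75, 0.98],            # oxyHb
    [0.99, 0.73, 0.47, 0.22, 0.00],            # deoxyHb
    [0.01, 0.02, 0.03, 0.03, 0.02],            # metHb
])
A = M @ true_amounts + rng.normal(0.0, 1e-4, (wavelengths.size, 5))

C = np.linalg.inv(M.T @ M) @ M.T               # p x m pseudoinverse of M
P = C @ A                                      # p x n amounts per spectrum
percent = 100.0 * P / P.sum(axis=0)            # normalize columns to percent

Y = percent[0] / (percent[0] + percent[1])     # Eq. (2): metHb excluded
```

In practice `np.linalg.pinv(M)` gives the same C and is numerically safer when parent spectra are nearly collinear.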

SVD  and  Matrix  Least  Squares 

The  purpose  of  this  section  is  to  describe  two  closely  related  computer-based  techniques 
that  place  stringent  demands  on  the  quality  of  spectrophotometric  data.  These  techniques  are 
sensitive  enough  to  pick  up  signals  below  the  noise  level  of  a  single  spectrum.  The  trouble  is 
that  lamps,  gratings,  and/or  drive  chains,  as  well  as  experimental  designs,  can  deliver  their  own 
signals,  which  can  confound  the  process  one  is  trying  to  measure.  For  example,  in  the  next 
section, we describe spurious signals that can arise simply because one is doing kinetics (or
"pseudoequilibrium" reactions) instead of step-wise equilibrium, and in the final section, we
describe  an  artifactual  spectral  component  that  contributes  <0.05%  of  the  total  signal. 

Let us keep to the context of the oxygenation of hemoglobin as measured by
spectrophotometers, although these techniques apply to all manner of spectra and processes.29,30
The data consist of a spectrum (from 480 to 650 nm in 1-nm steps to give 171 points in all)
collected at each value of pO2. As an example, if log10(pO2) ranges from -1 to 2 in steps of
0.015, which is 0.1 to 100 mm Hg in increasing steps, 201 complete spectra will be recorded in
all. The data are stored in matrix A with 171 rows and 201 columns. Each column of A is a
spectrum of hemoglobin at a fixed pO2, and each row of A is a hemoglobin oxygenation curve
at  a  fixed  wavelength. 

The  first  question  is:  How  many  independent  spectra  and  oxygen  processes  are  we  looking 
at?  With  no  notion  of  chemical  mechanism,  we  can  obtain  a  minimum  number  of  independent 
spectra  necessary  to  explain  all  the  data.  That  is,  we  can  find  the  least  number  of  spectra  needed 
to combine in various ratios to obtain an adequate representation of all the observed spectra.
This  same  number  serves  as  a  minimum  number  of  oxygenation  processes  to  explain  all  the 
observed  titration  curves  (rows  of  A).  This  number  is  called  the  rank  of  A,  and  there  are  many 
ways  to  estimate  it.  We  chose  a  standard  matrix  operation  called  SVD,  because  the  output  from 
SVD  will  serve  us  in  other  ways.  SVD  decomposes  the  matrix  A  into  three  factors, 

$$A = USV^{T}, \tag{3}$$

such that

$$U^{T}U = V^{T}V = I, \tag{4}$$

and S is diagonal, with $s_{11} \ge s_{22} \ge s_{33} \ge \cdots$. In our case, the sizes of the factors are

U and S: 171 x 171, V: 201 x 171.


The  numbers  on  the  main  diagonal  of  S,  sorted  in  descending  order,  are  the  singular  values  of 
A.  The  relations  in  Eq.  (3)  enable  us  to  think  of  A  as  a  sum  of  a  few  special  components, 

$$A = (U\ \text{column 1})\,s_{11}\,(V\ \text{column 1})^{T} + (U\ \text{column 2})\,s_{22}\,(V\ \text{column 2})^{T} + \cdots \tag{5}$$

The first term of Eq. (5), i.e., (U column 1)$s_{11}$(V column 1)$^{T}$, is a rank one matrix, the same size
as A. But being rank one, it contains only one "spectrum" (U column 1) in its columns, varying
only in scale. Likewise, it contains only one "titration curve" (V column 1) in its rows, again
varying only in scale. Furthermore, this rank one matrix, of all possible rank one matrices, is the
best fit to A in the least-squares sense, and the magnitude of $s_{11}$ tells how much of A is
explained by this optimal rank one term. Similar descriptions hold for the subsequent rank one
terms in Eq. (5), each of which is a best rank one fit to A minus the previous terms. When $s_{ii}$
is small enough to be regarded as noise, and when (U column i) and (V column i) fail to show
anything that looks like signal, we can discard those parts of U, S, and V and keep only the
minimal, "noiseless" representation approximating the signal in A. It is this minimal
representation property that makes SVD an appealing tool for analyzing complex mixtures. The
advantages of this representation are explained in Shrager.30

(At  this  point,  you  may  want  to  consult  a  linear  algebra  text  about  matrix  multiplication, 
transpose,  diagonal,  identity,  inverse,  and  the  Frobenius  norm,  which  we  will  refer  to  as  norm.) 
The  norm  of  S  is  the  same  as  the  norm  of  A.  You  can  estimate  the  rank  of  A  by  plotting 
$\log_{10}(s_{ii})$ versus i, most of which will be a smooth curve, almost a straight line for small i, except
for  the  first  few  values.  These  will  stand  out  above  the  others  as  signal  stands  out  above  noise, 
and the number of these standouts will estimate the rank. Sometimes, it is difficult to decide
where the standouts end, which is appropriate. The difficulty provides a sense of doubt about
concepts  such  as  rank.  One  should  not  put  too  much  confidence  in  computations  that  produce 
integer  answers  (decisions,  if  you  will)  from  data  with  a  continuous  range  of  possible  values. 
Alternate  and  possibly  conflicting  ways  of  estimating  rank  are  provided  in  previous 
publications.29,30  Rank  becomes  a  working  hypothesis,  not  a  "hard"  number. 
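
To make the rank estimate concrete, here is a small sketch on synthetic data of the same shape as the example above (171 x 201). The two "spectra", the Hill-type transition curve, the noise level, and the threshold of ten times the median singular value are all arbitrary assumptions for illustration; in keeping with the caution above, the resulting integer should be read as a working hypothesis.

```python
import numpy as np

rng = np.random.default_rng(2)
wavelengths = np.arange(480, 651)                 # 171 rows
log_po2 = np.linspace(-1.0, 2.0, 201)             # 201 columns, step 0.015
x = 10.0 ** log_po2

def band(center, width):
    """Hypothetical Gaussian band, used only to build test spectra."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Synthetic rank-2 data: two made-up spectra exchanged along one sigmoid.
spectra = np.column_stack([band(555, 20), band(541, 12) + band(577, 10)])
y = x**2.8 / (7.0**2.8 + x**2.8)                  # illustrative Hill-type curve
curves = np.column_stack([1.0 - y, y])            # 201 x 2
A = spectra @ curves.T + rng.normal(0.0, 1e-3, (171, 201))

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # U: 171x171, Vt: 171x201
V = Vt.T                                          # V: 201 x 171, as in the text

# The Frobenius norm of S equals that of A.
norm_match = np.isclose(np.linalg.norm(s), np.linalg.norm(A))

# log10 of the singular values, for plotting against the index i.
log_s = np.log10(s)

# Crude rank estimate: count singular values far above the noise floor,
# taken here (arbitrarily) as ten times the median singular value.
rank_estimate = int(np.sum(s > 10.0 * np.median(s)))
```

Inspecting the leading columns of U and V for recognizable signal, as described above, remains the more reliable check than any single threshold.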

Having  chosen  a  rank  r,  we  can  now  express  our  goal  in  matrix  terms.  We  wish  to 
decompose  the  matrix  A  into  two  factors, 

$$A = DF^{T} \tag{6}$$

where D is a 171 x r matrix, its r columns containing the spectra that are changing. The columns
of D are plotted versus wavelength. F is a 201 x r matrix of appearance-disappearance curves
for the spectra in D. The columns of F are plotted versus log10(pO2).

Just as you must know two of the three numbers when solving the scalar equation ab =
c, likewise, you must know two of the three matrices when solving $A = DF^{T}$. By using computer
modeling in conjunction with least squares, one can often obtain a good estimate of F (an
example of how F is chosen is given below) and then compute D by the formula $D = A(F^{T})^{+}$,
where $(F^{T})^{+}$ is the Moore-Penrose pseudoinverse of $F^{T}$.31 This process is called matrix least
squares, and programs for carrying it out directly, without going through the intermediate SVD
steps described below, are described in Frans and Harris.32 The relation between $A = DF^{T}$ (the
matrix least-squares decomposition) and $A = USV^{T}$ (the SVD) is given by,

$$V^{T} = HF^{T} \tag{7}$$

and

$$D = USH \tag{8}$$

where H is found by making successive guesses at the Adair parameters that determine F,
generating F, then applying matrix least squares,

$$H = V^{T}(F^{T})^{+}. \tag{9}$$

The solution parameters are those that minimize the norm of $S(V^{T} - HF^{T})$.
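
The relations in Eqs. (7) through (9) can be checked numerically with a short sketch. The data are synthetic, the matrix `D_true` is random, and the two-column F is generated by a hypothetical Hill-type function `model_curves` standing in for the Adair curves; none of these are taken from the experiments described here.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 171, 201, 2
x = 10.0 ** np.linspace(-1.0, 2.0, n)

def model_curves(p50, hill):
    """Hypothetical two-column F (deoxy and oxy fractions), n x r."""
    y = x**hill / (p50**hill + x**hill)
    return np.column_stack([1.0 - y, y])

# Synthetic data A = D_true F^T + noise, with made-up spectra in D_true.
D_true = rng.uniform(0.2, 1.0, (m, r))
A = D_true @ model_curves(7.0, 2.8).T + rng.normal(0.0, 1e-3, (m, n))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
F = model_curves(7.0, 2.8)                    # trial F for given parameters
Ft_pinv = np.linalg.pinv(F.T)                 # (F^T)^+, n x r

# Direct matrix least squares: D = A (F^T)^+.
D_direct = A @ Ft_pinv

# SVD route with all components: H = V^T (F^T)^+ (Eq. 9), D = U S H (Eq. 8).
H_full = Vt @ Ft_pinv
D_full = U @ np.diag(s) @ H_full

# SVD route keeping only the first r components (the smaller problem).
H_r = Vt[:r] @ Ft_pinv                        # r x r
D_r = U[:, :r] @ np.diag(s[:r]) @ H_r

def objective(F_trial):
    """Norm of S (V^T - H F^T) for the truncated problem."""
    H = Vt[:r] @ np.linalg.pinv(F_trial.T)
    return np.linalg.norm(np.diag(s[:r]) @ (Vt[:r] - H @ F_trial.T))
```

A curve-fitting routine would call `objective` repeatedly while adjusting the model parameters; here, the value at the generating parameters is much smaller than at a deliberately wrong P50.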

This  SVD-based  procedure  is  proven  in  Shrager30  to  give  exactly  the  same  result  as  direct 
matrix  least  squares.  So  what  are  the  reasons  for  using  SVD?  One  reason  is  that  the  full  U,  S, 
and  V  matrices  are  in  fact  never  used,  because  statistically  indistinguishable  results  can  be 
obtained by using only the first r columns of U and V, and the first r rows and columns of S.
Thus, in terms of computing effort, $V^{T} = HF^{T}$ is a much smaller problem than $A = DF^{T}$. But
equally  important,  SVD  offers  assistance  in  choosing  a  feasible  model.  Trying  to  derive  a  model 
by  looking  at  the  rows  of  A  is  often  difficult,  because  the  rows  of  A  tend  to  look  alike  in  a 
restricted  wavelength  region,  and  because  small  but  independent  trends  tend  to  be  swamped  by 
larger  trends  and  even  by  noise  in  any  single  row  of  A.  But  SVD  has  two  advantageous 
properties.  First,  SVD  collects  almost  all  the  signal  in  all  the  rows  of  A  into  the  fewest  possible 
columns of V (e.g., in our experiments, we rarely have to look past the fifth column in V for
signal). Second, SVD tends to produce columns of V of contrasting shape, so that if a small but
significant trend does not show up well in one column of V, it shows up well in another. These
contrasting shapes also convey information about mechanism. If there are only two spectra in
the data (e.g., deoxyHb and oxyHb with no distinction between T and R states), then the apparent
rank  of  A  will  be  r  =  2,  and  only  the  first  two  columns  of  V  will  have  significant  signal. 
Furthermore,  both  columns  will  follow  the  Adair  trend,  a  single  sigmoid,  with  contrast  only  in 
the base level and scale of the curves. (F in this case consists of two Adair curves: 0 to 1 and
1 to 0.) But if, say, there are four distinguishable species: deoxy T, deoxy R, oxy T, and oxy
R, then four columns of V will contain signal, and their number of up-and-down trends versus
pO2 will absolutely preclude a simple deoxy-to-oxy mechanism.
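
The equivalence of the two procedures, and the effect of truncating to r components, can be illustrated with a small numerical sketch. Everything in it is assumed for illustration: the data are synthetic, the two-column F uses an arbitrary sigmoid in place of fitted Adair curves, and the spectra in `D_true` are random numbers rather than hemoglobin spectra. For a fixed F, the linear parameters are found once by direct least squares on A and once through $V^T = HF^T$ followed by D = USH.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-species data set: 60 wavelengths, 25 oxygen pressures.
n_wavelengths, n_pressures = 60, 25
x = np.logspace(-1, 2, n_pressures)            # pO2, arbitrary units
oxy = x**2.8 / (10.0**2.8 + x**2.8)            # illustrative sigmoid, 0 to 1
F = np.column_stack([oxy, 1.0 - oxy])          # populations versus pO2
D_true = rng.random((n_wavelengths, 2))        # made-up component spectra
noise = 1e-4 * rng.standard_normal((n_wavelengths, n_pressures))
A = D_true @ F.T + noise                       # rows: wavelengths, columns: pO2

# Thin SVD, A = U S V^T; the columns of V are trends versus pO2.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Direct matrix least squares: D minimizing ||A - D F^T||.
D_direct = np.linalg.lstsq(F, A.T, rcond=None)[0].T

# SVD-based route: H minimizing ||V^T - H F^T||, then D = U S H.
H = np.linalg.lstsq(F, Vt.T, rcond=None)[0].T
D_svd = U @ np.diag(s) @ H

# Truncated version: keep only the first r components.
r = 2
D_trunc = U[:, :r] @ np.diag(s[:r]) @ H[:r]

print("singular values:", s[:4])
print("max |D_direct - D_svd|  :", np.abs(D_direct - D_svd).max())
print("max |D_direct - D_trunc|:", np.abs(D_direct - D_trunc).max())
```

With two species in the synthetic data, only the first two singular values stand above the noise, and the truncated result differs from the full one by an amount comparable to the noise level.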

So  now  the  question  is:  for  a  four-species  model,  what  can  we  use  in  place  of  the  Adair 
curves in F? The Adair curves were in F because the 0-to-1 curve described the appearance of
oxyHb, and the 1-to-0 curve described the disappearance of deoxyHb. Now we must describe
the appearance and disappearance of four species (species = type of site: unbound T or R,
or bound T or R) by postulating models and testing them. For a simple example, assume that
all hemoglobin tetramers are in the T state for zero sites or any one site bound to oxygen and in
the R state for any two, three, or four sites bound, with no distinction between the α and β sites.
By  the  laws  of  mass  action,  the  populations  of  the  four  species  (sites)  are: 

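Let $x = p\mathrm{O}_2$,

$$D = 1 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4,$$

$y_i$ = concentration of Hb(O$_2$)$_i$ (i.e., $y_0 = 1/D$, and $y_i = a_i x^i / D$, $i = 1, \ldots, 4$). Then the desired populations are:

$$\begin{aligned}
\text{deoxy T} &= y_0 + 0.75\,y_1,\\
\text{oxy T} &= 0.25\,y_1,\\
\text{deoxy R} &= 0.5\,y_2 + 0.25\,y_3,\\
\text{oxy R} &= 0.5\,y_2 + 0.75\,y_3 + y_4.
\end{aligned}$$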

So,  as  expected,  the  oxygenation  model  will  start  at  x  =  0  with  all  deoxy  T,  finish  at  high  x  with 
almost all oxy R, with oxy T and deoxy R as rising and falling intermediates. These are the four
columns of F, the shapes of which are governed by the a's, which are adjusted by a curve-fitting
program to minimize the norm of $A - DF^T$ (in full matrix least squares) or $S(V^T - HF^T)$ (in the
SVD-based procedure). The choice of model depends on the combination of intermediates assigned to the
two conformational states and can be tested by comparing how well each candidate model fits the data.
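
As a minimal sketch of how the four columns of F might be generated for this example model, the function below evaluates the site populations from a set of Adair coefficients. The function name `site_populations` and the coefficient values in `a_example` are hypothetical and chosen only so that the curves are easy to inspect; they are not fitted values.

```python
import numpy as np

def site_populations(x, a):
    """Columns [deoxy T, oxy T, deoxy R, oxy R] at oxygen pressures x.

    a = (a1, a2, a3, a4) are the Adair coefficients. Tetramers with zero or
    one ligand are taken to be T; those with two, three, or four are R.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a = np.asarray(a, dtype=float)
    terms = a * x[:, None] ** np.arange(1, 5)   # a_i x^i for i = 1..4
    D = 1.0 + terms.sum(axis=1)
    y0 = 1.0 / D
    y1, y2, y3, y4 = (terms / D[:, None]).T     # y_i = a_i x^i / D
    deoxy_T = y0 + 0.75 * y1
    oxy_T = 0.25 * y1
    deoxy_R = 0.5 * y2 + 0.25 * y3
    oxy_R = 0.5 * y2 + 0.75 * y3 + y4
    return np.column_stack([deoxy_T, oxy_T, deoxy_R, oxy_R])

# Illustrative coefficients only (assumed, not measured).
a_example = (0.05, 0.002, 0.0001, 0.0001)
x = np.linspace(0.0, 150.0, 301)
F = site_populations(x, a_example)

# Fractional saturation implied by the model: all bound sites, T or R.
saturation = F[:, 1] + F[:, 3]
```

The four columns sum to one at every pressure, and the sum of the two bound populations reproduces the Adair saturation curve, which is a convenient consistency check on any alternative assignment of intermediates to T and R.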

The  differences  between  T  and  R  spectra  are  subtle  at  best.  Only  a  sensitive  procedure 
can  hope  to  detect  them.  And  since  we  are  using  least  squares  rather  than  partial  least  squares, 
our  model  must  include  any  phenomenon  that  our  procedure  can  detect,  not  only  T  to  R 
transitions,  but  also  metHb  formation  and  dimerization,  unless  these  effects  can  be  stabilized. 
(With  some  reformulation  of  the  problem,  SVD  can  ignore  unchanging  background.)  But  to 
return  to  the  original  point,  if  our  instruments  (e.g.,  spectrophotometers  and  electrodes),  our  data 
scrubbing  (e.g.,  digital  filters),  or  our  experimental  designs  (e.g.,  kinetics  with  slow 
spectrophotometric  scans)  introduce  subtle  signals  of  their  own,  analysis  becomes  more  difficult. 

Kinetics  and  the  spectrophotometer:  a  study  in  artifact 

The purpose of this section is to provide some corrections for errors induced by slow
kinetics (i.e., "pseudoequilibrium") rather than step-wise equilibrium. Time must be recorded
along with pO2, absorbance, and even wavelength in some cases. Spectra take time to gather.
When  the  wavelength  range  is  scanned,  absorbance  at  each  wavelength  is  measured  at  a  different 
time.  When  several  such  spectra  are  averaged,  the  distribution  of  sample  times  is  unique  for  each 
wavelength. To specify this distribution, some notation is in order:

$n$ = number of scans combined to produce a single spectrum.

Wavelength:

$w$ = wavelength in nm.
$w_{\min}$ = minimum wavelength (start of a forward scan).
$w_{\max}$ = maximum wavelength (start of a backward scan).
$w_c$ = central $w$ = $(w_{\min} + w_{\max})/2$.

Time:

$t$ = time in some standard unit.
$t_s$ = time to scan from $w_{\min}$ to $w_{\max}$, forward or backward.
$t_d$ = dead time between successive scans.
$t_c$ = central $t$, midway between the start of scan 1 and the end of scan $n$, including dead
times. For all computations below, $t_c = 0$ should be used to improve the condition
of the matrices involved.
$c_i$ = central time of the $i$th scan relative to $t_c$.
$f(w)$ = time of sample of $w$ relative to $w_c$ in a forward scan.
$b(w)$ = time of sample of $w$ relative to $w_c$ in a backward scan.
$t_i(w)$ = time of sample of $w$ in the $i$th scan relative to $t_c$.

Absorbance:

$a$ = absorbance in optical density.
$a(w, t_i(w))$ = observed $a$ from the $i$th scan at $w$ (and $t_i(w)$).
$a(w, t_c)$ = estimated $a$ at $w$ and $t_c$ (i.e., a deduced simultaneous spectrum at $t_c$, corrected
for time dependence).
$a'(w, t_c)$ = the first time derivative, $da(w, t_c)/dt$.
$a''(w, t_c)$ = the second time derivative, $d^2 a(w, t_c)/dt^2$.

Our focus in this context is $a(w, t_c)$, but what we obtain from the scanning procedure is a series
of spectra,

$$a(w, t_i(w)) = a(w, t_c) + a'(w, t_c)\,t_i(w) + \tfrac{1}{2}\,a''(w, t_c)\,t_i^2(w) \qquad (10)$$

which, even when averaged, do not produce the desired spectrum. Equation (10) is a three-term
Taylor series expansion of $a(w, t)$ about $t_c$ with respect to time. (If the scan is so slow that three terms
are not enough, chances are that the wrong experiment is being performed.) Using formula (10)
as a model, time effects can be corrected for by fitting the curve $a(w, t_i(w))$ versus $t_i(w)$ to a
parabola at each fixed $w$. One needs at least three scans to do this because there are three
parameters, but 10 scans or more are preferred to reduce noise. To do any of this, the functions
$t_i(w)$ must be known. The following linear model of $t_i(w)$ is offered here as simple to apply, although a more
realistic model is to be preferred:
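$$\begin{aligned}
t_i(w) &= t_c + c_i + f(w), \quad \text{where}\\
c_i &= \left[i - (n+1)/2\right](t_s + t_d), \quad \text{and} \qquad (11)\\
f(w) &= t_i(w) - c_i = (t_s/2)\,(w - w_c)/(w_{\max} - w_c).
\end{aligned}$$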
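
The correction described by Eqs. (10) and (11) can be sketched as follows for forward scans only, with $t_c = 0$. This is an illustration under stated assumptions rather than the procedure actually used: the function names `sample_times` and `spectrum_at_tc`, the parameter names `t_s` (scan time) and `t_d` (dead time), and all numerical values are hypothetical, and the synthetic absorbances are made exactly quadratic in time so that the recovered spectrum can be checked.

```python
import numpy as np

def sample_times(w, n, t_s, t_d, w_min, w_max):
    """Sampling times t_i(w) for n forward scans, taking t_c = 0 (Eqs. 11)."""
    w = np.asarray(w, dtype=float)
    w_c = 0.5 * (w_min + w_max)
    i = np.arange(1, n + 1)
    c = (i - (n + 1) / 2.0) * (t_s + t_d)          # c_i, central time of scan i
    f = (t_s / 2.0) * (w - w_c) / (w_max - w_c)    # f(w), offset within a scan
    return c[:, None] + f[None, :]                 # shape (n, number of w)

def spectrum_at_tc(scans, times):
    """Fit Eq. (10) at each wavelength and return the intercept a(w, t_c)."""
    n, n_w = scans.shape
    a_tc = np.empty(n_w)
    for j in range(n_w):
        t = times[:, j]
        # Columns multiply a(w,t_c), a'(w,t_c), and a''(w,t_c) in Eq. (10).
        X = np.column_stack([np.ones(n), t, 0.5 * t**2])
        coef = np.linalg.lstsq(X, scans[:, j], rcond=None)[0]
        a_tc[j] = coef[0]
    return a_tc

# Synthetic check: 10 forward scans over 500-600 nm (all values assumed).
w = np.linspace(500.0, 600.0, 11)
times = sample_times(w, n=10, t_s=2.0, t_d=0.5, w_min=500.0, w_max=600.0)
a0 = 0.5 + 0.001 * (w - 500.0)                     # true a(w, t_c)
scans = a0 + 0.01 * times + 0.5 * 0.002 * times**2
recovered = spectrum_at_tc(scans, times)
naive_mean = scans.mean(axis=0)                    # plain average of the scans
```

In this synthetic case the fitted intercepts reproduce `a0`, whereas the plain average of the scans is biased by the drift and curvature terms.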

As  suggested  above,  tc  =  0  should  be  the  convention  in  Eqs.  (11).  From  Eqs.  (11),  for  any 
wavelength,  all  the  times  at  which  a(w,t)  was  sampled  can  be  generated.  Fitting  a  versus  t  to  a 
parabola  and  picking  off  a(w,tc)  is  then  standard  procedure.  Also,  when  all  scans  are  forward, 
the $t_i(w)$ are spaced equally, and the spacing $t_s + t_d$ is the same for all $w$, although the time
displacement from $t_c$ is not. Still, the common spacing allows for considerable economy of
calculation, because only one matrix need be inverted instead of one for each $w$, i.e.,